Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spark] Define and use DeltaTableV2.startTransaction helper method #2053

Conversation

ryan-johnson-databricks
Copy link
Collaborator

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

As part of making Delta more catalog-aware (see #2052), we need to solve two basic problems:

  1. When code calls the TableIdentifier overload of DeltaLog.forTable, the catalog lookup is performed internally and immediately discarded after extracting the table's storage location from it. If the caller needed the catalog info, they are out of luck. Caller can avoid the problem by creating a DeltaTableV2 instead, which already provides both DeltaLog and CatalogTable. To support this use case, we define new helper methods that make this easy to do (especially for unit tests).
  2. Even if we have a DeltaTableV2, there's no convenient way to start a transaction from it, in order to pass along the catalog info. To make it easy for callers to do the right thing, we define new helper methods for starting transactions directly from the DeltaTableV2 itself. When transactions eventually become aware of catalog info, these new helper methods will make a narrow waist that can be enhanced to pass along their catalog info.

How was this patch tested?

Existing unit tests (most changes are anyway in test code).

Does this PR introduce any user-facing changes?

Internal APIs only.

@scottsand-db
Copy link
Collaborator

LGTM!

Copy link
Contributor

@jaceklaskowski jaceklaskowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM excepts some nits

startTransaction(Some(snapshot))

/**
* Starts a transaction for this table, using Some provided snapshot, or a fresh snapshot if None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Some/some

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was intentional... Some/None as in Option.

* Starts a transaction for this table, using Some provided snapshot, or a fresh snapshot if None
* was provided.
*/
def startTransaction(snapshotOpt: Option[Snapshot] = None): OptimisticTransaction = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/snapshotOpt/snapshot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants